tr (Unix)

tr (abbreviated from translate or transliterate) is a command in Unix-like operating systems.

When executed, the program reads from the standard input and writes to the standard output. It takes as parameters two sets of characters, and replaces each occurrence of a character in the first set with the character at the same position in the second set. For example,

tr 'abcd' 'jkmn'

maps 'a' to 'j', 'b' to 'k', 'c' to 'm', and 'd' to 'n'.
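As a quick illustration of the mapping, the following sketch pipes a sample string (chosen here only for demonstration) through the same command. Characters outside the first set pass through unchanged.

```shell
# Translate a->j, b->k, c->m, d->n; spaces and other characters are left alone.
result=$(printf 'a bad cab\n' | tr 'abcd' 'jkmn')
printf '%s\n' "$result"   # prints: j kjn mjk
```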

Sets of characters may be abbreviated by using character ranges. The previous example could be written:

tr 'a-d' 'jkmn'

In POSIX-compliant versions of tr, the set represented by a character range depends on the locale's collating order, so it is safer to avoid character ranges in scripts that might be executed in a locale different from the one in which they were written. Ranges can often be replaced with POSIX character classes such as [:alpha:].
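The two approaches can be compared in a short sketch. The sample text is an assumption made for illustration; setting LC_ALL=C for the range form is one common way to pin the range to plain ASCII order, while the class form follows whatever locale is active.

```shell
# Upper-case a sample string two ways.
# LC_ALL=C fixes the meaning of a-z and A-Z to the ASCII letters.
with_range=$(printf 'hello, world\n' | LC_ALL=C tr 'a-z' 'A-Z')
# The class form needs no locale override.
with_class=$(printf 'hello, world\n' | tr '[:lower:]' '[:upper:]')
printf '%s\n%s\n' "$with_range" "$with_class"
```

For this ASCII-only input both commands print HELLO, WORLD.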

The -c flag complements the first set of characters, so that tr operates on every character not in that set.

tr -cd '[:alnum:]'

therefore removes all non-alphanumeric characters, including whitespace and newlines; the -d flag, described below, performs the deletion.
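A minimal sketch of this command on a sample string (the input is illustrative only) shows that punctuation, spaces, and the trailing newline are all removed:

```shell
# -c complements the set and -d deletes, so only letters and digits survive.
cleaned=$(printf 'Hello, World! 123\n' | tr -cd '[:alnum:]')
printf '%s\n' "$cleaned"   # prints: HelloWorld123
```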

The -s flag causes tr to squeeze each sequence of identical adjacent characters from the last specified set in its output into a single character. For example,

tr -s '\n' '\n'

replaces sequences of one or more newline characters with a single newline.
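The effect can be seen on a small sample, sketched below with made-up input. The second command assumes the single-set form of -s, which squeezes repeated characters without translating anything.

```shell
# Collapse runs of newlines, which removes the empty lines.
squeezed=$(printf 'one\n\n\ntwo\n\nthree\n' | tr -s '\n' '\n')
printf '%s\n' "$squeezed"

# With a single set, -s squeezes repeats of the listed characters in place.
spaces=$(printf 'a   b    c\n' | tr -s ' ')
printf '%s\n' "$spaces"   # prints: a b c
```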

The -d flag causes tr to delete all occurrences of the characters in the specified set from its input. In this case, only a single character set argument is used. The following command removes carriage return characters, thereby converting a text file with DOS/Windows line endings to one with Unix line endings.

tr -d '\r'
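Because tr reads only the standard input, converting a file requires shell redirection. The sketch below uses hypothetical file names (dos.txt and unix.txt) created in a temporary directory purely for demonstration.

```shell
# Hypothetical sample file with DOS (CRLF) line endings.
tmpdir=$(mktemp -d)
printf 'first\r\nsecond\r\n' > "$tmpdir/dos.txt"

# tr takes no file operands, so input and output are connected by redirection.
tr -d '\r' < "$tmpdir/dos.txt" > "$tmpdir/unix.txt"

unix_text=$(cat "$tmpdir/unix.txt")
rm -r "$tmpdir"
printf '%s\n' "$unix_text"
```

The output file must be different from the input file, since the shell truncates the redirection target before tr starts reading.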

Most versions of tr, including GNU tr and classic Unix tr, operate on single-byte characters and are not Unicode compliant. An exception is the Heirloom Toolchest implementation, which provides basic Unicode support.

Perl has a built-in tr operator and Ruby a String#tr method, both of which operate analogously. Tcl's string map command is more general in that it maps strings to strings, whereas tr maps characters to characters.
